预测基金绩效对投资者和基金经理都是有益的,但这是一项艰巨的任务。在本文中,我们测试了深度学习模型是否比传统统计技术更准确地预测基金绩效。基金绩效通常通过Sharpe比率进行评估,该比例代表了风险调整的绩效,以确保基金之间有意义的可比性。我们根据每月收益率数据序列数据计算了年度夏普比率,该数据的时间序列数据为600多个投资于美国上市大型股票的开放式共同基金投资。我们发现,经过现代贝叶斯优化训练的长期短期记忆(LSTM)和封闭式复发单元(GRUS)深度学习方法比传统统计量相比,预测基金的Sharpe比率更高。结合了LSTM和GRU的预测的合奏方法,可以实现所有模型的最佳性能。有证据表明,深度学习和结合能提供有希望的解决方案,以应对基金绩效预测的挑战。
translated by 谷歌翻译
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support an appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System to classify DR Grading, localize lesion areas, and provide visual explanations; (ii) DRG-Expert-Interaction to receive feedback from user-expert and improve the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. Besides, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and loss functions constraint between lesion features and classification features, our approach can be robust given a certain level of noise in the feedback of users. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.
translated by 谷歌翻译
State space models (SSMs) have demonstrated state-of-the-art sequence modeling performance in some modalities, but underperform attention in language modeling. Moreover, despite scaling nearly linearly in sequence length instead of quadratically, SSMs are still slower than Transformers due to poor hardware utilization. In this paper, we make progress on understanding the expressivity gap between SSMs and attention in language modeling, and on reducing the hardware barrier between SSMs and attention. First, we use synthetic language modeling tasks to understand the gap between SSMs and attention. We find that existing SSMs struggle with two capabilities: recalling earlier tokens in the sequence and comparing tokens across the sequence. To understand the impact on language modeling, we propose a new SSM layer, H3, that is explicitly designed for these abilities. H3 matches attention on the synthetic languages and comes within 0.4 PPL of Transformers on OpenWebText. Furthermore, a hybrid 125M-parameter H3-attention model that retains two attention layers surprisingly outperforms Transformers on OpenWebText by 1.0 PPL. Next, to improve the efficiency of training SSMs on modern hardware, we propose FlashConv. FlashConv uses a fused block FFT algorithm to improve efficiency on sequences up to 8K, and introduces a novel state passing algorithm that exploits the recurrent properties of SSMs to scale to longer sequences. FlashConv yields 2$\times$ speedup on the long-range arena benchmark and allows hybrid language models to generate text 1.6$\times$ faster than Transformers. Using FlashConv, we scale hybrid H3-attention language models up to 1.3B parameters on the Pile and find promising initial results, achieving lower perplexity than Transformers and outperforming Transformers in zero- and few-shot learning on a majority of tasks in the SuperGLUE benchmark.
translated by 谷歌翻译
Collecting large-scale medical datasets with fully annotated samples for training of deep networks is prohibitively expensive, especially for 3D volume data. Recent breakthroughs in self-supervised learning (SSL) offer the ability to overcome the lack of labeled training samples by learning feature representations from unlabeled data. However, most current SSL techniques in the medical field have been designed for either 2D images or 3D volumes. In practice, this restricts the capability to fully leverage unlabeled data from numerous sources, which may include both 2D and 3D data. Additionally, the use of these pre-trained networks is constrained to downstream tasks with compatible data dimensions. In this paper, we propose a novel framework for unsupervised joint learning on 2D and 3D data modalities. Given a set of 2D images or 2D slices extracted from 3D volumes, we construct an SSL task based on a 2D contrastive clustering problem for distinct classes. The 3D volumes are exploited by computing vectored embedding at each slice and then assembling a holistic feature through deformable self-attention mechanisms in Transformer, allowing incorporating long-range dependencies between slices inside 3D volumes. These holistic features are further utilized to define a novel 3D clustering agreement-based SSL task and masking embedding prediction inspired by pre-trained language models. Experiments on downstream tasks, such as 3D brain segmentation, lung nodule detection, 3D heart structures segmentation, and abnormal chest X-ray detection, demonstrate the effectiveness of our joint 2D and 3D SSL approach. We improve plain 2D Deep-ClusterV2 and SwAV by a significant margin and also surpass various modern 2D and 3D SSL approaches.
translated by 谷歌翻译
Diffusion models are rising as a powerful solution for high-fidelity image generation, which exceeds GANs in quality in many circumstances. However, their slow training and inference speed is a huge bottleneck, blocking them from being used in real-time applications. A recent DiffusionGAN method significantly decreases the models' running time by reducing the number of sampling steps from thousands to several, but their speeds still largely lag behind the GAN counterparts. This paper aims to reduce the speed gap by proposing a novel wavelet-based diffusion structure. We extract low-and-high frequency components from both image and feature levels via wavelet decomposition and adaptively handle these components for faster processing while maintaining good generation quality. Furthermore, we propose to use a reconstruction term, which effectively boosts the model training convergence. Experimental results on CelebA-HQ, CIFAR-10, LSUN-Church, and STL-10 datasets prove our solution is a stepping-stone to offering real-time and high-fidelity diffusion models. Our code and pre-trained checkpoints will be available at \url{https://github.com/VinAIResearch/WaveDiff.git}.
translated by 谷歌翻译
By utilizing only depth information, the paper introduces a novel but efficient local planning approach that enhances not only computational efficiency but also planning performances for memoryless local planners. The sampling is first proposed to be based on the depth data which can identify and eliminate a specific type of in-collision trajectories in the sampled motion primitive library. More specifically, all the obscured primitives' endpoints are found through querying the depth values and excluded from the sampled set, which can significantly reduce the computational workload required in collision checking. On the other hand, we furthermore propose a steering mechanism also based on the depth information to effectively prevent an autonomous vehicle from getting stuck when facing a large convex obstacle, providing a higher level of autonomy for a planning system. Our steering technique is theoretically proved to be complete in scenarios of convex obstacles. To evaluate effectiveness of the proposed DEpth based both Sampling and Steering (DESS) methods, we implemented them in the synthetic environments where a quadrotor was simulated flying through a cluttered region with multiple size-different obstacles. The obtained results demonstrate that the proposed approach can considerably decrease computing time in local planners, where more trajectories can be evaluated while the best path with much lower cost can be found. More importantly, the success rates calculated by the fact that the robot successfully navigated to the destinations in different testing scenarios are always higher than 99.6% on average.
translated by 谷歌翻译
使用通过组成可逆层获得的地图进行标准化模型复杂概率分布。特殊的线性层(例如蒙版和1x1卷积)在现有体系结构中起着关键作用,因为它们在具有可拖动的Jacobians和倒置的同时增加表达能力。我们提出了一个基于蝴蝶层的新的可逆线性层家族,理论上捕获复杂的线性结构,包括排列和周期性,但可以有效地倒置。这种代表力是我们方法的关键优势,因为这些结构在许多现实世界数据集中很常见。根据我们的可逆蝴蝶层,我们构建了一个新的称为蝴蝶流的归一化流量模型。从经验上讲,我们证明蝴蝶不仅可以在MNIST,CIFAR-10和Imagenet 32​​x32等自然图像上实现强密度估计结果,而且还可以在结构化数据集中获得明显更好的对数可能性,例如Galaxy图像和Mimic-III患者群体 - - 同时,在记忆和计算方面比相关基线更有效。
translated by 谷歌翻译
我们介绍了第一项经验研究,研究了突发性检测对意向检测和插槽填充的下游任务的影响。我们对越南人进行了这项研究,这是一种低资源语言,没有以前的研究,也没有公共数据集可用于探索。首先,我们通过手动添加上下文不满并注释它们来扩展流利的越南意图检测和插槽填充phoatis。然后,我们使用强基线进行实验进行实验,以基于预训练的语言模型,以检测和关节意图检测和插槽填充。我们发现:(i)爆发对下游意图检测和插槽填充任务的性能产生负面影响,并且(ii)在探索环境中,预先训练的多语言语言模型XLM-R有助于产生更好的意图检测和插槽比预先训练的单语言模型phobert填充表演,这与在流利性环境中通常发现的相反。
translated by 谷歌翻译
最近,多模式和跨模式AI技术吸引了社区的注意。前者的目标是收集脱节和异质数据,以补偿互补信息以增强健壮的预测。后者的目标是利用一种模式来通过发现它们之间的共同关注共享来预测另一种方式。尽管两种方法都共享相同的目标:从收集的原始数据中生成智能数据,但前者需要更多的方式,而后者则旨在减少各种方式。本文首先讨论了多模式和跨模式AI在智能数据分析中的作用。然后,我们介绍了多模式和跨模式AI框架(MMCRAI),以平衡上述方法,并使其易于扩展到不同的域。该框架集成到XDATAPF(交叉数据平台https://www.xdata.nict.jp/)中。我们还介绍并讨论了基于此框架和XDATAPF的各种应用程序。
translated by 谷歌翻译
时间序列异常检测在统计,经济学和计算机科学中进行了广泛的研究。多年来,使用基于深度学习的方法为时间序列异常检测提出了许多方法。这些方法中的许多方法都在基准数据集上显示了最先进的性能,给人一种错误的印象,即这些系统在许多实用和工业现实世界中都可以强大且可部署。在本文中,我们证明了最先进的异常检测方法的性能通过仅在传感器数据中添加小的对抗扰动来实质性地降解。我们使用不同的评分指标,例如预测错误,异常和分类评分,包括几个公共和私人数据集,从航空航天应用程序,服务器机器到发电厂的网络物理系统。在众所周知的对抗攻击中,来自快速梯度标志方法(FGSM)和预计梯度下降(PGD)方法,我们证明了最新的深神经网络(DNNS)和图形神经网络(GNNS)方法,这些方法声称这些方法是要对异常进行稳健,并且可能已集成在现实生活中,其性能下降到低至0%。据我们最好的理解,我们首次证明了针对对抗攻击的异常检测系统的脆弱性。这项研究的总体目标是提高对时间序列异常检测器的对抗性脆弱性的认识。
translated by 谷歌翻译